-ABSTRACT-

The problem of deadlock in distributed data base management is analyzed
in terms of performance effects of potential deadlock handling schemes.
The performance tradeoffs of deadlock detection and deadlock prevention
for distributed data base management systems are compared. Since the
run-time overhead in deadlock prevention is projected to be less than for
deadlock detection, an algorithm for preventing deadlocks in distributed
data base systems is developed. The critical information for the deadlock
prevention algorithm is maintained in a shared record list. The shared
record list contains all shared access records for a set of tasks. Shared
record lists are maintained dynamically by the run-time system. A proof
that the algorithm prevents deadlocks in a distributed data base
management system is provided along with a comprehensive example. A
discussion of the efficiency of the deadlock prevention algorithm
indicates that partitioning the data base into sub-schemas reduces the
overhead.
1. Introduction

One of the major trends in computer systems is toward the decentralization
of resources and facilities. This phenomenon has led to a requirement for
distributed data base management systems. A distributed DBMS permits an
application program executing on a processor in a computer network to access
data on any other node in the network. In an optimal situation the only
limits on data access are communication linkages and security. A distributed
DBMS relies upon its underlying network for communication facilities. The
basic data base software in a distributed system is functionally identical to
that of a centralized (one machine) system.

From the preceding statements it appears that a distributed DBMS can be
realized by interfacing a centralized data base system with a network
communication facility. However, as indicated by Fry and Sibley [10], when a
data base system is extended to operate over several machines a whole new
class of problems arises while many existing problems become more complex.
Among those problems that are complicated by the distribution of the data
base is that of deadlock. This paper proposes a method for preventing deadlock
in a distributed data base system. The algorithm stated here is intended to
avoid deadlock in a manner that is transparent to the application program and
with minimal effect on processes that would not be involved in a deadlock
situation.

2. Distributed Data Base Systems

Before the deadlock prevention algorithm can be presented, terminology
concerning data base management systems in general and distributed data
base systems in particular must be established. A schema is a logical
description of a data base. The portion of a schema that may be accessed by
a particular application program is specified by the sub-schema associated
with that program. The sub-schema defines the data records that the program
may operate upon and indicates the logical relationships among the records
in the data base.

In a centralized data base system, the application program, sub-schema
and data base software all reside on one processor which is physically
connected to the secondary storage containing the data base. A distributed
data base management system has resources and their control spread among the
processors of a computer network. In a distributed DBMS, an application
program executing on one network node may access data that reside at several
distinct nodes. A computer that executes data base application programs is a
host machine. A back-end machine is one which controls access to data. A
machine with both capabilities is termed a bi-functional machine. Figure 1
portrays a distributed DBMS with host, back-end, and bi-functional nodes. A
discussion of the general organization of distributed data bases can be found
in Reference [18].

In order to implement a distributed DBMS, the software required for
a centralized DBMS is necessary plus some communication and control software.
As indicated in Reference [18], communication between application programs
and data base tasks can best be accomplished by means of a generalized
message system which is capable of handling communication among heterogeneous
machines. The message system must also allow for transmission over a wide
range of inter-machine connections, from conventional phone lines (typically
1200 baud) to shared memory (approximately 10M baud).

A group of machines tied together via memory-to-memory linkages is known
as a cluster. A network can be formed from a collection of clusters joined
by lower speed connections. In a distributed DBMS communication between
machines in the same cluster can occur with little overhead. However,
intercluster communication may result in noticeable performance degradation.
The impact of intermachine communication upon system performance is an
important consideration in the treatment of deadlock in a distributed DBMS.

3. Deadlock in a Distributed DBMS

In general, deadlock occurs when two or more processes request a set of
shared resources in a sequence that leaves each process unable to proceed
until it acquires a resource held by another process that is itself blocked.
A considerable amount of study has been done on the deadlock problem of
operating systems [5,6,9,12-14,19-20]. The particular form of deadlock under
consideration in this report involves processes that are data base
application tasks residing on (potentially) different machines in a data base
network. The shared resource is a collection of records dispersed among the
data bases of the system that may be updated by more than one application
program. It is assumed that no user imposed restrictions concerning
accessibility have been placed upon the data records. This form of deadlock
is a DBMS problem, not a user problem. The means of treating deadlock must be
totally transparent to the application program.

The special difficulty of deadlock in a distributed data base is that
since there is no central control point in the network, the responsibility
for noting the actual or potential occurrence of a deadlock situation can not
easily be assigned. The task requesting the resource can be informed that the
request was unable to be fulfilled. However, if that resource is controlled
by one (or more) different processors, considerable manipulation and
intermachine communication would be required to determine if a deadlock
situation exists.


4. Deadlock Detection and Prevention

There are two basic mechanisms for treating the deadlock problem [4].
One approach is to prevent deadlock before it can occur. Deadlock prevention
requires a priori knowledge of the shared records to be operated upon by all
active application tasks in the system. The alternative to deadlock
prevention is deadlock detection, which involves noting the existence of a
deadlock situation and then resolving the dilemma. Generally, a deadlock
situation can be resolved by halting one of the competing processes and
freeing its resources for access by other processes. However, task "rollback"
is detrimental to system performance and in some cases infeasible.

In a distributed DBMS, both deadlock prevention and detection produce
considerable overhead, particularly if intercluster communication results.
Deadlock prevention in a distributed data base requires that records that may
be shared among several tasks (and updated by at least one) be identified.
For this information to be meaningful, only the records shared with currently
active tasks should be included. This implies that whenever a task that
updates shared records enters the system the list of shared records must be
revised. When this has taken place, the new task can proceed. Whenever a
need to access a shared record arises, a prevention algorithm can be invoked
to determine if the access may proceed. If not, the task is blocked until
the record is available.


The main cause of overhead in deadlock prevention in a distributed DBMS
is the computation and communication of the list of shared records for active
tasks. This is a continual operation which must occur whenever tasks are
created and destroyed. As in all prevention schemes, some time is devoted
to avoiding deadlocks which would not have occurred during a particular
execution, since a task may have access rights to many more records than it
actually uses.
Deadlock detection schemes represent an a posteriori approach to the
problem of avoiding deadlocks. In a distributed data base system, deadlock
detection involves first identifying a set of two or more tasks blocking each
other from a collection of shared records. Generally a "timeout" mechanism,
which involves noting that the affected tasks have been waiting for longer
than some fixed time, is used in deadlock detection. Once the set of
deadlocked tasks and the conflicting resources are identified, one of the
tasks must be rolled back to some point that will free the resources necessary
to break the deadlock. Rollback involves restoring all data to the values held
before the operations being retracted were performed. In a distributed
environment, the data base operations may be initiated on one host processor
and carried out by several different back-end processors. Rollback action
would have to be initiated by the host processor and then carried out by the
back-end processors in a manner analogous to the execution of a standard
data base access. This will necessitate considerable message transmission
activity to start and synchronize the rollback operation.

An additional negative performance factor in the deadlock detection
approach is that tasks not involved in the deadlock situation can be affected
if they have accessed data written by the sequence of commands being rolled
back. In this case, the task accessing the data would also have to be rolled
back. It is possible for the rollback to cascade throughout the system in the
worst case.
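
The cascade effect can be illustrated with a small sketch. The Python fragment below is not part of the report: it assumes a hypothetical bookkeeping map, here called `reads_from`, that records which tasks have accessed data written by which other tasks, and it computes the set of tasks that would have to be rolled back once a single victim is chosen.

```python
from collections import deque

def rollback_set(victim, reads_from):
    """Return every task that must be rolled back if `victim` is rolled back.

    reads_from maps a task to the set of tasks whose written data it has
    accessed (a hypothetical structure, assumed only for illustration).
    """
    # Invert the map: writer -> tasks that have read the writer's data.
    readers = {}
    for reader, writers in reads_from.items():
        for writer in writers:
            readers.setdefault(writer, set()).add(reader)

    # Breadth-first walk over the "has read data written by" relation.
    to_roll_back = {victim}
    queue = deque([victim])
    while queue:
        task = queue.popleft()
        for dependent in readers.get(task, ()):
            if dependent not in to_roll_back:
                to_roll_back.add(dependent)
                queue.append(dependent)
    return to_roll_back

# T2 read data written by T1, and T3 read data written by T2.
deps = {"T2": {"T1"}, "T3": {"T2"}, "T4": set()}
print(sorted(rollback_set("T1", deps)))
```

In this illustration rolling back T1 drags T2 and T3 along with it, while T4, which touched none of the affected data, is left alone.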

The deadlock detection algorithms described in references [2,3,7,15] all
require a dynamic list of processes and the records that they access. This
information is similar to that required in a deadlock prevention scheme.
Acquiring and maintaining an accessibility list for deadlock detection could
require a heavy communication load in a distributed system, as in the case of
deadlock prevention.


A comparison of the two alternatives for avoiding deadlock indicates
that both types of algorithms would require some form of a record
accessibility list. Since deadlock prevention requires continual computational
and communication activity to avoid deadlock whereas deadlock detection
measures are most likely to be invoked infrequently, there is potentially
more overhead in deadlock prevention. However, the overhead is fixed and the
prevention algorithm has no effect upon processes that may not be involved in
any deadlock situations. In a distributed data base system, the rollback and
timeout mechanisms of deadlock detection could result in substantial
computation and communication overhead and also the rolling back or blocking
of tasks not involved in the deadlock situation.

Because of the uncertain and potentially serious performance degradation
that may result from rollback in a distributed DBMS, deadlock prevention is
a safer strategy for handling deadlocks in a distributed data base
environment. Thus, this paper concentrates upon deadlock prevention.


5. Information Required to Prevent Deadlock

In a distributed DBMS utilizing the proposed prevention algorithm,
each back-end processor will be responsible for preventing deadlock situations
involving the portion of the data base under its control. Since the back-end
processor performs the function of manipulating the data, it is best suited
to assume the responsibility for deadlock prevention. In this way, the
burden of preventing deadlock is removed from the application task.

For each active task that it is serving, the back-end processor maintains
a list of records that may be accessed by several tasks and updated by at
least one. This potential shared record list is derived from the sub-schema
of the task. When a task is initiated, its shared record list is circulated
among the back-end processors to determine if any interaction with other tasks
exists. A list of interacting tasks is maintained by the back-end processors.
In order to minimize the communications overhead upon task initiation, each
back-end processor can maintain a list which indicates those back-end
processors which may control records shared with any given sub-schema. Only
these back-ends with potential interaction need be contacted. Upon task
termination similar action must be taken to withdraw the task and its records
from the task interaction and shared record lists. The shared record list is
conceptually similar to the process set of Chu and Ohlmacher [4].

6. Deadlock Prevention

An algorithm for the prevention of deadlock in a distributed DBMS
is developed in this section. Initially some notational conventions must
be established.
Definition 1 (Notation)

1. $r_j$ - a record in the data base.

2. $T_k$ - an application task.

3. $R_k$ - potential shared record list of $T_k$. A set of shared records
   is accessed by several tasks and updated by at least one.

4. $X_T$ - a task interaction list. A set of tasks whose potential
   shared record lists have non-empty pairwise intersections.

5. $S_T$ - the shared record list of $X_T$. All records appearing in more
   than one potential shared record list of the tasks in $X_T$.

6. $B_k$ - the back-end processor executing a data base request for $T_k$.

7. $S_{T,k}$ - the shared record list of a set of tasks $T$ on back-end
   processor $k$. A record in a shared record list is marked with
   a task identifier when it is requested or locked.

8. $m(S_{T,k})$ - the number of distinct tasks that have records marked
   in $S_{T,k}$.

9. $L(S_T)$ - the number of distinct tasks that have records locked
   in $S_T$.

For a given task interaction list $X_T$, a copy of $S_T$, the shared record
list for $X_T$, is maintained on each back-end processor executing data base
operations for a task in $X_T$.
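
As a rough illustration of this notation, the following Python sketch derives a shared record list from a set of potential shared record lists. It is not taken from the report: the function names, the dictionary representation, and the sample task and record names are all assumptions made for the example.

```python
from itertools import combinations

def is_interaction_list(tasks, R):
    """True if every pair of tasks has a non-empty intersection of
    potential shared record lists (the task interaction list property)."""
    return all(R[a] & R[b] for a, b in combinations(tasks, 2))

def shared_record_list(tasks, R):
    """Records appearing in more than one potential shared record list."""
    shared = set()
    for a, b in combinations(tasks, 2):
        shared |= R[a] & R[b]
    return shared

def marked_task_count(marks):
    """Number of distinct tasks with a record marked.

    marks maps each record of a shared record list to the identifier of
    the task marking it, or None if the record is unmarked."""
    return len({task for task in marks.values() if task is not None})

# Hypothetical potential shared record lists for three tasks.
R = {"Ta": {"a", "b"}, "Tb": {"b", "c"}, "Tc": {"a", "c", "d"}}
tasks = ["Ta", "Tb", "Tc"]
print(is_interaction_list(tasks, R))
print(sorted(shared_record_list(tasks, R)))
```

Record `d` is absent from the result because only one task can access it, so it can never be a source of contention.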
In order to properly prevent deadlock, the state of the system
immediately prior to a deadlock state must be described and recognized. If
an algorithm can be developed which ensures that the distributed DBMS will
never enter a state that can immediately lead to deadlock, then the algorithm
will prevent deadlock. First let us formally define deadlock.


Definition 2

A set of tasks $T = \{T_1, T_2, \ldots, T_m\}$, $m \ge 2$, is deadlocked if for
$1 \le i \le m-1$, $T_i$ is blocked by $T_{i+1}$ and $T_m$ is blocked by $T_1$.
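
Definition 2 amounts to requiring that the "is blocked by" relation forms a single cycle through every task in the set. A minimal Python sketch of that check follows; the `blocked_by` map (task to the task currently blocking it) is an assumed representation, not something the report specifies.

```python
def is_deadlocked(tasks, blocked_by):
    """Check whether `tasks` can be numbered so that each task is blocked
    by the next one and the last is blocked by the first.

    blocked_by maps a task to the task blocking it (absent or None if the
    task is not blocked). This representation is an assumption."""
    tasks = set(tasks)
    if len(tasks) < 2:
        return False
    start = next(iter(tasks))
    current, seen = start, set()
    # Follow the chain for exactly len(tasks) steps; it must visit every
    # task once and then return to the starting task.
    for _ in range(len(tasks)):
        if current not in tasks or current in seen:
            return False
        seen.add(current)
        current = blocked_by.get(current)
    return current == start

print(is_deadlocked({"T1", "T2", "T3"},
                    {"T1": "T2", "T2": "T3", "T3": "T1"}))
print(is_deadlocked({"T1", "T2", "T3"},
                    {"T1": "T2", "T2": "T3"}))
```

The second call is not a deadlock because T3 is not blocked, so the chain of waits can eventually unwind.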


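Example 1

Assume there are five tasks $T_1$, $T_2$, $T_3$, $T_4$, $T_5$ active in the
system. Let

    $R_1 = \{r_1, r_2, r_3\}$
    $R_2 = \{r_1, r_4, r_7\}$
    $R_3 = \{r_2, \text{[illegible]}, r_7\}$
    $R_4 = \{\text{[illegible]}, r_9\}$
    $R_5 = \{r_8, r_9\}$

    $X_1 = \{T_1, T_2\}$
    $X_2 = \{T_1, T_3\}$
    $X_3 = \{T_2, T_3\}$
    $X_4 = \{T_1, T_2, T_3\}$
    $X_5 = \{T_4, T_5\}$

and

    $S_1 = \{r_1\}$
    $S_2 = \{r_2\}$
    $S_3 = \{r_7\}$
    $S_4 = \{r_1, r_2, r_7\}$
    $S_5 = \{r_9\}$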
If $T_1$ is blocked by $T_2$, $T_2$ is blocked by $T_3$, and $T_3$ is blocked
by $T_1$, then $X_4$ is deadlocked.

Definition 3

A set of tasks $T = \{T_1, T_2, \ldots, T_m\}$, $m \ge 2$, is in a deadlock-prone
state if there is a sequence of unfulfillable requests that can be issued
by the tasks in $T$ that will place $T$ in a deadlocked state.


Lemma  1 

A  set  of  tasks  cannot  enter  a  deadlock  state  without  first  entering 
a  deadlock-prone  state. 

This  result  follows  immediately  from  Definitions  2  and  3. 


Example 2

Consider the set of tasks in the previous example.

Assume that

    $T_1$ has locked $r_1$;
    $T_2$ has locked $r_7$;
    and $T_3$ has locked $r_2$.

$X_4$ is in a deadlock-prone state since the following sequence of commands
results in a deadlock state:

    $T_1$ requests $r_2$
    $T_2$ requests $r_1$
    $T_3$ requests $r_7$

since $T_1$ will be blocked by $T_3$, $T_2$ will be blocked by $T_1$, and
$T_3$ will be blocked by $T_2$.

Lemma 2 indicates a method for detecting the existence of a deadlock-prone
state.
Lemma 2

A set of tasks $T$ is in a deadlock-prone state if and only if
$L(S_T) = |T|$.

Proof

Let $T = \{T_1, T_2, \ldots, T_k\}$ where the numbering of tasks is arbitrary,
$k \ge 2$.

We will first show that if $L(S_T) = |T|$, then $T$ is in a deadlock-prone
state.

Let $\{r_1, r_2, \ldots, r_k\} \subseteq S_T$.

Let $Q_0$ be the state of the system when $L(S_T) = |T|$.

Since $L(S_T) = |T|$, each task in $T$ must have at least one record locked.
We can assume that for $1 \le i \le k$, $r_i$ is locked by $T_i$.

Since each record in $S_T$ is contained in the intersection of the record
lists of at least two tasks, we can assume that for $1 \le i \le k-1$,
$r_{i+1} \in R_i \cap R_{i+1}$, and $r_1 \in R_k \cap R_1$.

From system state $Q_0$, let task $T_i$ request record $r_{i+1}$,
$1 \le i \le k-1$, and let task $T_k$ request record $r_1$.

The system will then enter state $Q_1$ in which the following condition
holds:

    $T_1$ is blocked by $T_2$;
    $T_2$ is blocked by $T_3$;
    ...
    $T_{k-1}$ is blocked by $T_k$;
    $T_k$ is blocked by $T_1$.

Thus state $Q_1$ is a deadlock state.

From Definition 3, it follows that $Q_0$ is a deadlock-prone state.

Therefore $L(S_T) = |T|$ implies that $T$ is in a deadlock-prone state.

It must now be demonstrated that the existence of a deadlock-prone
state implies that $L(S_T) = |T|$.

Let $Q_0$ be a deadlock-prone state such that there is a sequence of
unsatisfiable requests which lead to deadlock state $Q_1$.

Assume that the tasks in $T$ are blocked in state $Q_1$ as shown:

    $T_1$ is blocked by $T_2$;
    $T_2$ is blocked by $T_3$;
    ...
    $T_{k-1}$ is blocked by $T_k$;
    $T_k$ is blocked by $T_1$.

If a task is blocking another task, the intersection of their record
lists must be non-empty. Therefore, there is a set of records
$\{r_1, r_2, \ldots, r_k\} \subseteq S_T$ such that

    for $1 \le i \le k-1$, $r_{i+1} \in R_i \cap R_{i+1}$,
    and $r_1 \in R_k \cap R_1$.

Since each task in $T$ is blocking another task, each task in $T$ must
have at least one record locked in state $Q_1$. Therefore in state $Q_1$,

    $L(S_T) = |T|$.

Since $Q_1$ was reached from $Q_0$ by a sequence of unsatisfiable requests,
the set of records locked in $Q_1$ is identical to the set of records
locked in $Q_0$. Therefore in state $Q_0$, $L(S_T) = |T|$ where $Q_0$ is a
deadlock-prone state.


Example 3

In the deadlocked and deadlock-prone states of the preceding examples,

    $S_4 = \{\underset{1}{r_1}, \underset{3}{r_2}, \underset{2}{r_7}\}$

where the integers beneath the records indicate the task locking the record.

Thus, $L(S_4) = 3$.
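
The test stated in Lemma 2 is easy to express in code. The sketch below is only an illustration under assumed data structures: a shared record list is represented as a dictionary from record name to the task holding its lock (or `None`), and the record and task names are placeholders.

```python
def locked_task_count(tasks, locks):
    """Number of distinct tasks in `tasks` holding at least one lock.

    locks maps each record of the shared record list to the task that has
    locked it, or None if the record is not locked (assumed layout)."""
    return len({t for t in locks.values() if t is not None and t in tasks})

def is_deadlock_prone(tasks, locks):
    """Lemma 2 condition: every task in the set holds a locked record."""
    tasks = set(tasks)
    return locked_task_count(tasks, locks) == len(tasks)

tasks = {"T1", "T2", "T3"}
# Each of the three tasks holds one shared record: deadlock-prone.
print(is_deadlock_prone(tasks, {"ra": "T1", "rb": "T3", "rc": "T2"}))
# One record is still free, so only two tasks hold locks.
print(is_deadlock_prone(tasks, {"ra": "T1", "rb": None, "rc": "T2"}))
```

Keeping the count of lock-holding tasks strictly below the size of the task set is therefore enough to stay out of the deadlock-prone state.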
7. An Algorithm for the Prevention of Deadlock

From Lemma 2, we can see that if $L(S_T) \le |T| - 1$ for all sets of shared
records, then the system will be free of deadlock. This relationship between
the number of tasks potentially and actively sharing data and the occurrence
of deadlock forms the basis for a deadlock prevention algorithm.

Three commands and a response are necessary for operation in a
deadlock-free environment. All commands and responses are transmitted among
back-end processors. The commands are LOCK, UNLOCK and REQUEST. When a task
desires to update a shared record, the back-end processor controlling that
record issues either a LOCK or REQUEST command to the other back-end
processors controlling records of tasks in any of the task interaction lists
of the requesting task. The decision as to whether LOCK or REQUEST is sent is
based upon relative task priorities. LOCK commands are sent to the back-end
processors of lower priority tasks, while back-end processors serving tasks of
higher priority receive REQUEST commands. If a back-end processor that has
received a REQUEST command determines that the record is available, it issues
a POSITIVE response. It is important to note that under the deadlock
prevention algorithm a negative response is not necessary since a back-end
processor issuing a REQUEST for an unavailable record or a REQUEST that would
lead to deadlock will receive a LOCK command which invalidates the REQUEST.
The UNLOCK command relinquishes control of a record. The detailed effect of
each function is described in the following definitions. A command or request
is not necessary for the query of a shared record. A check of the shared
record list will indicate if the record is available. In this discussion
the term "update" will indicate both a read and write, while "query" implies
a read only.
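
The priority rule for choosing between REQUEST and LOCK can be sketched as follows. This is an illustrative fragment only: the numeric priority encoding (larger number means higher priority), the function name, and the returned dictionary are assumptions, and the sketch identifies recipients by task rather than by back-end processor.

```python
def plan_commands(requester, interacting_tasks, priority):
    """Split the tasks interacting with `requester` into those whose
    back-ends receive a REQUEST (higher priority) and those whose
    back-ends receive a LOCK (lower priority).

    priority maps task -> number, larger meaning higher priority
    (an assumed encoding; priorities are taken to be distinct)."""
    higher = sorted(t for t in interacting_tasks
                    if t != requester and priority[t] > priority[requester])
    lower = sorted(t for t in interacting_tasks
                   if t != requester and priority[t] < priority[requester])
    return {"REQUEST": higher, "LOCK": lower}

priority = {"T1": 3, "T2": 2, "T3": 1}
print(plan_commands("T2", {"T1", "T2", "T3"}, priority))
```

A task of highest priority therefore never waits for POSITIVE responses, while a task of lowest priority must hear from every interacting back-end before it can proceed.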


Definition 4

The REQUEST $r_j$, $T_k$ command issued by $B_k$ to $B_i$ results in the
following operations:

If for all $S_{T,i}$ containing $r_j$,

    a. $r_j$ is unmarked in $S_{T,i}$, and

    b. either $T_k$ has a record marked in $S_{T,i}$ or
       $m(S_{T,i}) < |X_T| - 1$ for $|X_T| \ge 2$,

then $B_i$ marks $r_j$ in all $S_{T,i}$ with the identifier of $T_k$ and
transmits a POSITIVE $r_j$, $T_k$ response to $B_k$.

Otherwise $B_i$ does not respond to $B_k$.

Definition 4 indicates that two conditions must be satisfied before a
back-end processor can signify that a record is available:

1. The record must not be claimed by another task.

2. If the first condition is met, then it must be certified that granting
control of the record to $T_k$ would not cause any set of tasks to enter a
deadlock-prone state.
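
The two conditions can be read as a simple guard evaluated by the back-end that receives the REQUEST. The Python sketch below shows one such reading; it is not the report's implementation. The representation of each shared record list as a `(task_set, marks)` pair, the function names, and the interpretation of the threshold as "fewer than |X_T| - 1 tasks currently have records marked" are all assumptions.

```python
def may_grant(record, requester, shared_lists):
    """Evaluate the two conditions for a REQUEST at the receiving back-end.

    shared_lists is a list of (task_set, marks) pairs, one per local shared
    record list; marks maps record -> marking task or None (assumed layout)."""
    for tasks, marks in shared_lists:
        if record not in marks:
            continue                    # this list does not contain the record
        if marks[record] is not None:
            return False                # condition 1: record already claimed
        marking = {t for t in marks.values() if t is not None}
        if requester not in marking and len(marking) >= len(tasks) - 1:
            return False                # condition 2: grant could lead to a
                                        # deadlock-prone state
    return True

def handle_request(record, requester, shared_lists):
    """Mark the record and answer POSITIVE, or stay silent (return None)."""
    if not may_grant(record, requester, shared_lists):
        return None                     # no negative response is ever sent
    for _, marks in shared_lists:
        if record in marks:
            marks[record] = requester
    return ("POSITIVE", record, requester)

lists = [({"T1", "T2", "T3"}, {"r1": "T1", "r2": None, "r7": None})]
print(handle_request("r7", "T2", lists))   # granted: only one task has marks
print(handle_request("r2", "T3", lists))   # silent: two of three tasks marked
```

Returning `None` rather than an explicit refusal mirrors the observation that a negative response is unnecessary under the algorithm.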

Definition 5

The LOCK $r_j$, $T_k$ command, when issued by $B_k$ to $B_i$, causes $r_j$ to
be marked as locked by $T_k$ in all shared record lists of $B_i$.

Definition 6

The UNLOCK $r_j$ command causes $r_j$ to be unmarked in all shared record
lists.

The purpose of the REQUEST command and the POSITIVE response is to confirm
the availability of a record with higher priority tasks. Since the distributed
data base environment permits concurrent asynchronous operations on shared
data, a back-end processor must query those back-ends that contain higher
priority tasks which interact with the requesting task to verify the status
of the record. If the record is available, a POSITIVE response is sent. If
the record is unavailable at the time of the REQUEST, a LOCK command would
already be in transit to the back-end of the requesting task. The receipt
of a LOCK command from a higher priority task invalidates a REQUEST. The
LOCK command binds a record to a task while UNLOCK is used to release records.

The  functions  and  responses  are  employed  by  the  following  algorithm 
to  prevent  the  distributed  DBMS  from  entering  a  deadlock  state. 


Algorithm  1 
PART  A 

When task $T_k$ desires to update shared record $r_j$, the following steps
must be taken by $B_k$ to prevent a deadlock state.

1. Check if $r_j$ is marked in any $S_{T,k}$ containing $r_j$. If so, $T_k$
must wait until $B_k$ receives an UNLOCK $r_j$ command. Note that if $r_j$ is
marked in one $S_{T,k}$, it is marked in all $S_{T,k}$.

2. If $\exists\, S_{T,k}$ such that $m(S_{T,k}) = |X_T| - 1$, then $T_k$ must
wait until a record in $S_{T,k}$ is unlocked.

3. Mark $r_j$ with the identifier of $T_k$ in all $S_{T,k}$ containing $r_j$.

4. For all higher priority tasks in any $X_T$ containing $T_k$,
issue a REQUEST $r_j$, $T_k$ command to their back-end processors.

5. Wait for POSITIVE $r_j$ responses from all back-ends of step 4.

6. If while waiting, $B_k$ receives a LOCK $r_j$, $T_i$ command, then $B_k$
must issue UNLOCK $r_j$ commands to all back-ends which have transmitted
POSITIVE $r_j$ responses, and then $B_k$ must return to step 1.

7. If while waiting $B_k$ receives a LOCK $r_n$, $T_i$ command
($r_n \ne r_j$), and $m(S_{T,k}) = |X_T| - 1$, then $B_k$ must issue UNLOCK
$r_j$ commands to all back-ends which have transmitted POSITIVE $r_j$
responses and then return to step 2.

8. When $B_k$ receives POSITIVE $r_j$ responses from all tasks in step 4 it
issues a LOCK $r_j$ command to all lower priority tasks in any $X_T$
containing $T_k$.

9. $T_k$ may then operate upon $r_j$.

10. Upon completion of the operations in step 9, $B_k$ issues an UNLOCK $r_j$
command to all tasks in any $X_T$ containing $T_k$.

PART B

When a back-end processor, $B_i$, receives a REQUEST $r_j$, $T_k$ command, it
transmits a POSITIVE $r_j$ response if the requirements of Def. 4 are
satisfied. If a POSITIVE response is transmitted, $r_j$ is marked with the
identifier of $T_k$ in all $S_{T,i}$.

PART C

When a back-end processor, $B_i$, receives a LOCK $r_n$, $T_k$ command and it
does not have a REQUEST $r_j$ command outstanding such that the conditions in
steps 6 or 7 of Part A arise, $r_n$ is marked with the identifier of $T_k$ in
all $S_{T,i}$.


It  must  now  be  demonstrated  that  Algorithm  1  prevents  deadlock. 

Lemma  3 

Algorithm  1  prevents  the  system  from  entering  a  deadlock-prone  state. 


From Def. 1, it can be seen that under all circumstances, for any $S_T$,
$m(S_{T,k}) \ge L(S_T)$ for all back-ends $B_k$.

Under Algorithm 1, a back-end processor may only issue a LOCK command
if $m(S_{T,k}) < |X_T| - 1$ for all $X_T$. Thus immediately prior to the
issuance of a LOCK command, $L(S_T) < |X_T| - 1$. After the LOCK command has
been issued, the maximum value of $L(S_T)$ is $|X_T| - 1$ for all $S_T$.
Therefore according to Lemma 2, if the system operates under Algorithm 1, it
cannot enter a deadlock-prone state.
Theorem 1

Algorithm  1  prevents  the  system  from  entering  a  deadlock  state. 

Proof 

The  theorem  follows  immediately  from  Lemmas  1  and  3. 

The following example illustrates the operation of a distributed DBMS with
the deadlock prevention algorithm in effect.


Example  U 

Assume the set of tasks in Example 1. We will follow the actions of
the back-end processors B_1, B_2, and B_3, which control data base access
for T_1, T_2, and T_3 respectively. The only task set in which the value of
m(S_T) can be greater than 2 is X_4 = {T_1, T_2, T_3}.

In the example under consideration, S_4 = {r_1, r_2, r_3}. For notational
convenience each S_{4,i} will be denoted by an ordered triple in which the
first element corresponds to the task, if any, marking r_1, the second element
indicates the task marking r_2, while the final element represents the task
marking r_3.

Assume the following set of references in the sample system.

Time    Task    Record Referenced
t_0     T_1     r_1
t_0     T_2     r_3
t_0     T_3     r_2
t_10    T_1     r_2
t_10    T_2     r_1
t_10    T_3     r_3

Once a task receives the second record it has requested, it requires 5
time units to complete its operation on those records.
For purposes of this example, the delays between back-end processors
shown below are assumed.

B_1 <-> B_2 - 2 time units

B_1 <-> B_3 - 3 time units

B_2 <-> B_3 - 1 time unit.
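The timing of every step in the example follows from these delays: a message sent at time t arrives at t plus the delay of the link it travels. The sketch below assumes that the 2-unit and 3-unit delays belong to the B_1/B_2 and B_1/B_3 links respectively and that delays are symmetric. The names `DELAYS` and `arrival_time` are illustrative only.

```python
# Symmetric link delays in time units (pairing of the first two is assumed).
DELAYS = {
    frozenset({"B1", "B2"}): 2,
    frozenset({"B1", "B3"}): 3,
    frozenset({"B2", "B3"}): 1,
}


def arrival_time(sent_at, sender, receiver):
    """Time at which a message sent at `sent_at` reaches `receiver`."""
    return sent_at + DELAYS[frozenset({sender, receiver})]


# A LOCK issued by B1 at t_0 reaches B2 at t_2 and B3 at t_3.
print(arrival_time(0, "B1", "B2"), arrival_time(0, "B1", "B3"))
```

Using `frozenset` keys makes the lookup independent of direction, which matches the two-way arrows in the delay list.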

Also,  let  the  task  priorities  be 

wv 

Given the requests listed above, the following operations are performed
in the system.


1.  At time t_0,

S_{4,1} = {1, , }    S_{4,2} = { , ,2}    S_{4,3} = { ,3, }

B_1 issues LOCK r_1, T_1 to B_2 and B_3.

T_1 proceeds to operate on r_1.

B_2 issues REQUEST r_3, T_2 to B_1.

B_3 issues REQUEST r_2, T_3 to B_1 and B_2.

2.  At time t_1,

B_2 receives REQUEST r_2, T_3.

S_{4,2} = { ,3,2}.

B_2 transmits a POSITIVE r_2 to B_3.


3.  At time t_2,

B_1 receives REQUEST r_3, T_2.

S_{4,1} = {1, ,2}.

B_1 transmits a POSITIVE r_3 to B_2.

B_2 receives a LOCK r_1, T_1.

S_{4,2} = {1,3,2}.


4.  At time t_3,

B_3 receives LOCK r_1, T_1.

S_{4,3} = {1,3,2}.

10. At time t_15,

B_1 issues UNLOCK r_1 and UNLOCK r_2 to B_2 and B_3.

S_{4,1} = { , ,2}.

11. At time t_17,

B_2 receives UNLOCK r_1 and UNLOCK r_2 and issues REQUEST r_1, T_2 to B_1.

S_{4,2} = {2, ,2}.

12. At time t_18,

B_3 receives UNLOCK r_1 and UNLOCK r_2 and issues REQUEST r_2, T_3
to B_1 and B_2.

S_{4,3} = { ,3,2}.

13. At time t_19,

B_1 receives REQUEST r_1, T_2.

S_{4,1} = {2, ,2}.

B_1 issues POSITIVE r_1 to B_2.

B_2 receives REQUEST r_2, T_3.

S_{4,2} = {2,3,2}.

B_2 issues POSITIVE r_2 to B_3.

14. At time t_20,

B_3 receives POSITIVE r_2 from B_2.
15. At time t_21,

B_1 receives REQUEST r_2, T_3.

S_{4,1} = {2,3,2}.

B_1 issues POSITIVE r_2 to B_3.

B_2 receives POSITIVE r_1 from B_1 and then issues a LOCK r_1, T_2 to B_3
and proceeds to operate upon r_1.

16. At time t_22,

B_3 receives LOCK r_1, T_2.

S_{4,3} = {2,3,2}.

17. At time t_24,

B_3 receives POSITIVE r_2 from B_1 and then proceeds to operate on r_2.

18. At time t_26,

B_2 issues UNLOCK r_1 and UNLOCK r_3 to B_1 and B_3.

S_{4,2} = { ,3, }.

19. At time t_27,

B_3 receives UNLOCK r_1 and UNLOCK r_3 from B_2 and then transmits REQUEST
r_3, T_3 to B_1 and B_2.

S_{4,3} = { ,3,3}.

20. At time t_28,

B_1 receives UNLOCK r_1 and UNLOCK r_3 from B_2.

S_{4,1} = { ,3, }.

B_2 receives REQUEST r_3, T_3.

B_2 issues a POSITIVE r_3 to B_3.

S_{4,2} = { ,3,3}.

21. At time t_29,

B_3 receives POSITIVE r_3 from B_2.

22. At time t_30,

B_1 receives REQUEST r_3, T_3 from B_3.

S_{4,1} = { ,3,3}.

B_1 transmits POSITIVE r_3 to B_3.

23. At time t_33,

B_3 receives POSITIVE r_3 from B_1 and then begins to operate upon r_3.


7.  Efficiency of Deadlock Prevention Algorithm

It is difficult to treat the efficiency of Algorithm 1 without
experimental evidence from a prototype distributed DBMS. There are several
critical performance factors which could vary among distributed DBMS
implementations. Network topology, degree of potential sharing among application
tasks, and subschema size are among the system parameters that will have
the strongest performance effects.

The number of back-end processors in the network, along with the
physical distance and type of connections among the back-end processors,
will influence the amount of communication overhead resulting from deadlock
prevention. It should be noted that only back-end processors that
execute data base operations for tasks that share data have a need to
exchange information in order to prevent deadlock. If both deadlock prevention
and a high degree of efficiency are goals of a system design, then data shared
by a group of tasks should be controlled by a minimal number of back-end
processors. (Ideally, each such unit of shared data would reside on the
storage of a single back-end processor.)

One environment under which the deadlock prevention algorithm could
degrade performance is an on-line system in which each user may access
any record in the entire data base. In this situation, the deadlock prevention
algorithm would force the DBMS to operate in a single-threaded mode. If the
sub-schema concept is applied to this unrestricted on-line environment, the
undesirable effects of deadlock prevention can be virtually eliminated.
Instead of permitting each on-line command unrestricted access to the entire
data base, the data base can be partitioned into a logical collection of
sub-schemas. Whenever a user issues an on-line data base request, the
appropriate sub-schema is invoked prior to the actual execution of the command.
The sub-schemas should be defined to encompass only that portion of the data
base that the command may access. For example, if an airline reservation
clerk updates a passenger list, the sub-schema would contain the passenger
list for a given flight.

In an on-line environment in which the data base has been partitioned
into sub-schemas, the user need not sacrifice any flexibility of data access.
Each user would interface with a re-entrant control program which parses
each request, invokes the applicable sub-schema, and then activates a host
task to execute the request. By organizing an on-line system in this manner,
the negative performance effects of deadlock prevention are minimized.
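The control program described above (parse the request, invoke the sub-schema, activate a host task) can be sketched as a small dispatcher. Everything here is hypothetical: the request syntax, the `SUB_SCHEMAS` table, and the function names are illustrative assumptions built on the airline example, not part of the paper's design.

```python
def parse(request):
    """Split 'command key=value ...' into a command name and an argument dict."""
    command, *pairs = request.split()
    args = dict(pair.split("=", 1) for pair in pairs)
    return command, args


def passenger_list_sub_schema(args):
    """Sub-schema for the airline example: one flight's passenger list only."""
    return {("passenger_list", args["flight"])}


# ASSUMPTION: one sub-schema builder per on-line command.
SUB_SCHEMAS = {"update_passenger_list": passenger_list_sub_schema}


def handle(request, run_host_task):
    """Parse the request, invoke its sub-schema, then activate a host task."""
    command, args = parse(request)
    sub_schema = SUB_SCHEMAS[command](args)
    return run_host_task(command, args, sub_schema)


# Two clerks working on different flights touch disjoint sub-schemas.
a = handle("update_passenger_list flight=101", lambda c, a, s: s)
b = handle("update_passenger_list flight=202", lambda c, a, s: s)
print(a.isdisjoint(b))  # True
```

Because the two requests resolve to disjoint record sets, the corresponding host tasks would share no records and so would not constrain each other under the prevention algorithm.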

8.  Conclusion

The approach to deadlock prevention described here is a dynamic pre-claim
technique [8] since it implicitly locks a set of records that could
be required for an operation. The deadlock prevention scheme proposed
for distributed data base systems has some similarity to the data base
deadlock prevention mechanisms of Lomet [16] and Chu and Ohlmacher [4]. Lomet
employs a graph-theoretical technique for avoiding deadlock. However, the
information contained in the graphs is essentially the same as that maintained
in the shared record list. The version of Lomet's algorithm presented in
Reference [16] does not consider the performance effects of operation in a
distributed environment.

The process set of Chu and Ohlmacher is very close conceptually to the
shared record list. The algorithm of Chu and Ohlmacher is also intended
for distributed systems. The technique developed here differs from the
approach of Chu and Ohlmacher in that it operates at the record level and
in that a requesting task is given control of only that part of its shared
record list necessary to avoid a deadlock-prone state. The feasibility of
data base sharing at the record level has been studied using simulation by
Shemer and Collmeyer [20], who projected that even with a high degree of
contention, performance degradation due to overhead would be minimal.
Presently, several commercially available, single-machine data base systems
provide data sharing and locking at the record level [22].

The deadlock prevention algorithm is intended to prevent all possible
deadlocks while allowing maximum data base sharing at the record level.
The results of Section 6 demonstrate that the algorithm meets these
criteria. Naturally, deadlock prevention incurs some overhead. However,
careful planning by the designer of a distributed data base application
who is cognizant of the operation of the prevention algorithm can
minimize the overhead. For a distributed DBMS application to
operate efficiently under the deadlock prevention algorithm, it is important
that the data base be partitioned into sub-schemas. However, once the
sub-schema is defined, the individual application programs need not be aware of
the mechanics of the deadlock prevention algorithm.

Due to the infrequency of deadlock situations, a deadlock detection
scheme requires less overhead than deadlock prevention in a single machine
DBMS. However, the uncertain amount of overhead in distributed DBMS rollback
added to the fixed overhead of the timeout mechanism, both of which are
necessary operations in deadlock detection, leads to proposing deadlock prevention as
the more satisfactory mechanism for handling deadlocks in a distributed
data base. The subject of rollback for a distributed DBMS is treated in
reference [17], which presents an algorithm for minimizing the overhead of
the rollback operation.

The lists computed by the prevention algorithm have potential application
in two critical design areas of distributed data base. The size of a task
interaction list, X, is an indicator of the amount of interference
resulting from the activation of a data base task. Since task interference
has an effect on system performance, the scheduler could use the size of
X as one of the weighting factors in the scheduling algorithm.
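One way a scheduler might use the size of X as a weighting factor is sketched below. The sketch makes two assumptions that are not taken from the paper: a task's interaction list is modelled as the set of tasks whose sub-schemas share at least one record with it (including the task itself), and the scheduler simply prefers tasks with smaller lists. The function names are hypothetical.

```python
def interaction_list(task, subschemas):
    """ASSUMED model of X: tasks whose sub-schema overlaps this task's."""
    mine = subschemas[task]
    return {other for other, records in subschemas.items() if records & mine}


def schedule_order(subschemas):
    """Order tasks by |X| (least interference first), ties broken by name."""
    return sorted(
        subschemas,
        key=lambda task: (len(interaction_list(task, subschemas)), task),
    )


subschemas = {
    "T1": {"r1", "r2"},
    "T2": {"r2", "r3"},
    "T3": {"r4"},
}
print(schedule_order(subschemas))  # ['T3', 'T1', 'T2']
```

In a real scheduler the list size would be only one term among several weights, as the text indicates.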

The  contents  of  the  shared  record  list  would  be  of  use  to  a  prepaging 
memory  manager  [23].  Records  that  are  unlocked,  yet  only  requestable  by  a 
single  task,  could  be  transmitted  to  the  page  buffer  associated  with  that 
task.  The  shared  record  list  would  be  particularly  valuable  when  used  in 
conjunction  with  a  Markovian  paging  model  [1,11]. 
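The prepaging rule stated above (records that are unlocked yet requestable by only one task) can be expressed as a simple filter over the shared record list. The data layout below, a mapping from task to its requestable records plus a set of locked records, is an assumption made for illustration, and the function name is hypothetical.

```python
def prepage_candidates(subschemas, locked):
    """Map each unlocked record requestable by exactly one task to that task."""
    requesters = {}
    for task, records in subschemas.items():
        for record in records:
            requesters.setdefault(record, set()).add(task)
    return {
        record: next(iter(tasks))
        for record, tasks in requesters.items()
        if len(tasks) == 1 and record not in locked
    }


subschemas = {"T1": {"r1", "r2"}, "T2": {"r2", "r3"}}
# r1 is locked, r2 is shared by two tasks, so only r3 qualifies (for T2).
print(prepage_candidates(subschemas, locked={"r1"}))  # {'r3': 'T2'}
```

A memory manager could transmit each candidate record to the page buffer of the single task named for it.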